In an ideal database CPU and I/O should be the top wait events.
If there are events like TX row lock contention and latch free then that means there is contention in your database.
The db file sequential read (which means index reads) should be higher as compared to db file scattered read (which means full table scans).
The following are 10 of the most common causes for wait events, along with explanations and potential solutions:
Cell Smart Table Scan The cell smart table scan event is what Oracle uses to account for time spent waiting for Full Table Scans that are offloaded. It is the most important new event on the Exadata platform. Its presence or absence can be used to verify whether a statement benefitted from Offloading or not.
Offloading only occurs when Oracle is able to do direct path reads. Consequently, this event replaces the direct path read event in most cases on Exadata. As with normal direct path reads, data is returned directly to the PGA of the requesing process on the database server. Blocks are not returned to the buffer cache
Exadata Smart Scan is a feature of the Database Machine. It offloads the data search and retrieval processing to the storage cell. Exadata Cell evaluate query predicates at the storage level to optimize the performance of certain classes of bulk data processing. For example the performance of queries that require a full table or index scans to evaluate selective predicates can be improved by pushing the database expression evaluations to the storage cell.
0. Cell Single Block Physical Read. This event is equivalent to the db file sequential read event used on non-Exadata platforms. Single block reads are used most often for index access paths (both the index block reads and the table block reads via rowids from the index lookups). They can also be used for a wide variety of other operations where it makes sense to read a single block.
If these are the only I/O wait events you see (and not together with multiblock reads) then it appears you are not using a full segment scan at all. Of course, sometimes single block reads show up due to other operations in the execution plan (like some index range scan) or due to chained rows in data blocks.
Exadata provides a large amount of flash cache (384G) on each storage cell. For that reason, physical reads (both multi-block and single-block) are considerably faster than on most disk-based storage systems.
1. DB File Scattered Read. This generally indicates waits related to full table scans. As full table scans are pulled into memory, they rarely fall into contiguous buffers but instead are scattered throughout the buffer cache. A large number here indicates that your table may have missing or suppressed indexes.
2. DB File Sequential Read. This event generally indicates a single block read (an index read, for example). A large number of waits here could indicate poor joining orders of tables, or unselective indexing. It is normal for this number to be large for a high-transaction, well-tuned system, but it can indicate problems in some circumstances. You should correlate this wait statistic with other known issues within the Statspack report, such as inefficient SQL.
These circumstances are usually interrelated. When they occur in conjunction with the appearance of the db file scattered read and db file sequential read in the Top 5 Wait Events section.
3. Free Buffer. This indicates your system is waiting for a buffer in memory, because none is currently available.
Waits in this category may indicate that you need to increase the DB_BUFFER_CACHE, if all your SQL is tuned.
Free buffer waits could also indicate that unselective SQL is causing data to flood the buffer cache with index blocks, leaving none for this particular statement that is waiting for the system to process. This normally indicates that there is a substantial amount of DML (insert/update/delete) being done and that the Database Writer (DBWR) is not writing quickly enough; the buffer cache could be full of multiple versions of the same buffer, causing great inefficiency.
To address this, you may want to consider accelerating incremental checkpointing, using more DBWR processes, or increasing the number of physical disks.
To investigate if this is an I/O problem, look at the File I/O Statistics.
4. Buffer Busy. This is a wait for a buffer that is being used in an unshareable way or is being read into the buffer cache.
Buffer busy waits should not be greater than 1%.
Check the Buffer Wait Statistics section (or V$WAITSTAT) to find out the type of wait:
To prevent buffer busy waits related to data blocks, you can also use a smaller block size: fewer records fall within a single block in this case, so it's not as "hot."
When a DML occurs, Oracle Database writes information into the block, including all users who are "interested" in the state of the block (Interested Transaction List, ITL).
To decrease waits in this area:
Buffer busy wait happens when a session wants to access a database block in the buffer cache, but it cannot as the buffer is "busy". This indicates that there are some buffers in the buffer cache that multiple processes are attempting to either access concurrently while its being read from disk or waiting for another session's block change to complete.
The two main cases where this can occur are the following:
AWR/statspack report top timed event shows significant percentage of database time spent on this wait event.
Concurrent INSERTs with a suboptimal freelist configuration can lead to buffer busy wait contention as multiple sessions attempt to insert data into the same block (because it appears on the freelist to them). Heavy INSERT activity by concurrent sessions can cause multiple sessions to attempt their insert into the same blocks because automatic segment space management (ASSM) is NOT used, AND there is only a single freelist, too few process freelists, and/or no freelist groups.
The best solution is to use ASSM since it is sometimes tricky to arrive at a correct freelist or freelist group setting. Adding process freelists will help remove contention as each process will map to separate blocks. Freelists can be added at any time without rebuilding the table. Adding freelist groups will also remove contention by mapping processes to other freelists. This is of greatest benefit in RAC environments where the freelist group block itself will be associated with an instance, but will still help in single instance environments as well. The table must be rebuilt to change the freelist group setting.
Concurrent INSERTs or updates may see contention when a related index has a key that is constantly increasing (e.g., a key based on a sequence number). Index leaf blocks may see contention due to key values that are increasing steadily (using a sequence) and concentrated in a leaf block on the "right-hand side" of the index. Look at using reverse key indexes (if range scans aren't commonly used against the segment). A reverse key index will spread keys around evenly and avoid creating these hot leaf blocks. However, the reverse key index will not be usable for index range scans, so care must be taken to ensure that access is normally done via equality predicates. Eliminate HOT blocks access from the application. Many concurrent physical reads against the same blocks will result in buffer busy waits as one session gets to do the actual physical read, and the others will be blocked by the buffer busy wait event until the read completes. This is usually an indication that the SQL statement must be tuned. Oracle's SQL Tuning Advisor can help tune specific SQL statements quickly and easily if you are licensed to use the Enterprise Manager Tuning Pack.
Use automatic undo management or add more rollback segments. The waits can be amplified greatly when physical reads are slow due to poor I/O subsystem performance.
Once you have applied the changes to resolve the issues you have found, compare the latest AWR to the AWR that led you here via Guided Resolution (that AWR becomes your baseline). Look at the percentage decrease total wait time for this event. If there are still issues, re- evaluate those and address them according to the specific symptom.
Automatic segment space management (ASSM) is a simpler and more efficient way of managing space within a segment. It eliminates any need to specify and tune the pctused, freelists, and freelist groups storage parameters for schema objects created in the tablespace.
If any of these attributes are specified, they are ignored.
Note:
Note:
Block change tracking buffer space
When the session is waiting for the block change tracking buffer space event, it indicates that there is space waiting in the CTRW dba buffer.
Several causes for this wait event are:
Workings of CTWR:
Library Cache Lock
The library cache load lock Oracle metric occurs when the process is waiting for the opportunity to load an object or a piece of an object into the library cache. The loads of SQL statements are serialized and locked in exclusive mode, such that only one process can load an object or a piece of an object at a time. In general, all library cache waits are associated with non-reentrant SQL or an undersized shared pool.
In general the library cache load lock occurs during periods of high activity within the library cache, especially with databases that do not use host variables and have a hard parsing due to non-reusable SQL.
5. Latch Free. Latches are low-level queuing mechanisms (they're accurately referred to as mutual exclusion mechanisms) used to protect shared memory structures in the system global area (SGA).
Latches are like locks on memory that are very quickly obtained and released. They are used to prevent concurrent access to a shared memory structure. If the latch is not available, a latch free miss is recorded.
Most latch problems are related to:
There are also latch waits related to bugs; check MetaLink for bug reports if you suspect this is the case.
When latch miss ratios are greater than 0.5%, you should investigate the issue.
If latch free waits are in the Top 5 Wait Events or high in the complete Wait Events list, look at the latch-specific sections of the AWR report to see which latches are contended for.
6. Enqueue. An enqueue is a lock that protects a shared resource.
Locks protect shared resources, such as data in a record, to prevent two people from updating the same data at the same time.
An enqueue includes a queuing mechanism, which is FIFO (first in, first out).
Note: Oracle's latching mechanism is not FIFO. Enqueue waits usually point to the following enqueues:
TX4:Are the most common enqueue waits and they are usually the result of one of three issues:
Given the following situation. This information can be retrieved from a statspack, ADDM, ASH report or v$session_wait query:
Event % Wait Time P1 Parameter P1 Value P2 Parameter P2 Value Parameter 3 P3 Value enq: HW - contention 62,81 name|mode 1213661190 table space 4 # block 17005691
1. Determine the correct file and block number:
select DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE(17005691) FILE#, DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK(17005691) BLOCK# from dual; FILE# BLOCK# ---------- ---------- 4 228475
For bigfile tablespaces, do not use DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE, or you will get wrong results. Reference: https://docs.oracle.com/cd/E18283_01/appdev.112/e16760/d_util.htm#i1002531 In such a case, just use the tablespace# and assume p3 is the block number (there is no relative file number). 2. Determine the object to which this block belongs to
select owner, segment_type, segment_name from dba_extents where file_id = 4 and 228475 between block_id and block_id + blocks - 1; OWNER SEGMENT_TYPE SEGMENT_NAME --------------- --------------- ------------------------------ SCOTT LOBSEGMENT EMP_DATA_LOB
Additionally, if the lock contention is currrently observed, we can find out the underlying segment using the following query:
select DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE(ID2) FILE#, DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK(ID2) BLOCK# from v$lock where type = 'HW';
As the 'enq HW - contention' may be caused by a number of different reasons, there are also several possible different solutions to alleviate or reduce contention.
Things to check are:-
In the case of point 4. there are a couple of options that may be able to be employed to provide either temporary relief or a workaround for the problem
ALTER TABLE MODIFY LOB () (allocate extent (size ));
** The following ALERT should be READ before manually allocating space to a LOB Segment
- NOTE 1229669.1 Bug 8198906 - Segment header corruption if extent allocation operation is interrupted
ALTER TABLE test_lob MODIFY LOB (image) (SHRINK SPACE);
** The following documents should be READ before performing a LOB Shrink Operations
- Bug 5636728 - LOB corruption / ORA-1555 when reading LOBs after a SHRINK operation (Doc ID 5636728.8)
- Bug 5768710 - ALTER TABLE SHRINK slow with LOB (Doc ID 5768710.8)
This can be enabled by setting event 44951 to a value between 1 and 1024 (default is 1). With the value between 1 and 1024 setting the number of chunks to be cleaned up each time a chunk reclaimation operation occurs. This can therefore reduce the number of requests for the High Watermark Enqueue.
EVENT="44951 TRACE NAME CONTEXT FOREVER, LEVEL < 1 - 1024 >"
Refer to NOTE 6376915.8 "Bug 6376915 HW enqueue contention for ASSM LOB segments"
With Manual Segment Space Management, this value cannot be altered and is fixed at 128.
7. Log Buffer Space. This wait occurs because you are writing the log buffer faster than LGWR can write it to the redo logs, or because log switches are too slow.
To address this problem:
8. Log File Switch. All commit requests are waiting for logfile switch (archiving needed) or logfile switch (checkpoint Incomplete).
9. Log File Sync. When a user commits or rolls back data, the LGWR flushes the session's redo from the log buffer to the redo logs. The log file sync process must wait for this event to successfully complete.
To reduce wait events here:
10. Idle Event. There are several idle wait events listed after the output; you can ignore them. Idle events are generally listed at the bottom of each section and include such things as SQL*Net message to/from client and other background-related timings. Idle events are listed in the stats$idle_event table.
11. Direct path writes: They occur when performing appends or data loads. In addition, they occur when you see full table scans or fast full index scans as of Oracle 11g
12. Direct path reads: Occur when there is a lot of parallel query activity
13. db file parallel writes / read: Occurs when there is a lot of partition activity; it could be a table or index partition
14. db file single write: Occurs if there is a high number of data files
15. Direct path read temp or Direct path write temp: Often this event is a result of an undersized temporary tablespace
These two events are triggered when a session is reading or writing buffers to or from temporay segments. These reads and writes are directly to or from the PGA. These occur because work areas are too large to fit in memory and are written to disk
These are the biggest waits for large data warehouse sites.
However, if the workload is not a DSS workload then examine why this is happening.
You can do so by taking the following actions:
16. SQL*Net more message from dblink:A session waits while the server process receives messages over a database link from another server process.
This time could go up because:
This wait blocks until a message is received from the remote connection (or until an abnormal end of file condition occurs on the underlying Net transport layer).
The blocker is the network plus the remote process. If the remote process is another database instance (accessed via a database link) then the information in View:V$SESSION on the REMOTE database can be used to help find the remote session for the database link connection. You should look at that remote session and determine where it is spending time. If system wide waits for this event are significant it is best to determine where the remote connections are to and switch attention to the remote instance / instances to determine where they are spending time.
One can also look at:
Sessions with high values in View:V$SESSTAT for:
The Network between the local and remote systems (problems are usually related to time spent ON the remote instance rather than in the network but it is worth checking if the network between the instances is slow / not)
17. SQL*Net more data from client:Client is sending data to shadow that spans packets (think large data inserts, possibly large pre blocks, large SQL statements) Shadow waits for next packet.
There are a number of reasons why this might be happening; however, a few of them are:
A chatty application may use very frequent round trips to deliver data. Maybe the application design could be improved first?
Buffer Type Waits: So what's going on in your memory
Log Type Waits
18. Read by other session
When a session waits on the �read by other session� event, it indicates a wait for another session to read the data from disk into the Oracle buffer cache because the block you need is not in the cache.
Excessive waits for this event are typically due to several processes repeatedly reading the same blocks, for example, sessions scanning the same index or performing tables scans on the same blocks.
The following query will give the blocks experiencing this wait event:
SELECT p1 "file#", p2 "block#", p3 "class#" FROM v$session_wait WHERE event = 'read by other session';
The query below will give the name and object type involved:
SELECT relative_fno, owner, segment_name, segment_type FROM dba_extents WHERE file_id = &file AND &block BETWEEN block_id AND block_id + blocks - 1;
Depending on the Oracle database environment and specific performance situation the following variety of methods can be used to eliminate contention:
19. Row Cache Lock. The Row Cache or Data Dictionary Cache is a memory area in the shared pool that holds data dictionary information. Row cache holds data as rows instead of buffers. A Row cache enqueue lock is a lock on the data dictionary rows. It is used primarily to serialize changes to the data dictionary and to wait for a lock on a data dictionary cache. The enqueue will be on a specific data dictionary object. This is called the enqueue type and can be found in the v$rowcache view. Waits on this event usually indicate some form of DDL occurring, or possibly recursive operations such as storage management, sequence numbers incrementing frequently, etc. Diagnosing the cause of the contentionLatches are low-level queuing mechanisms (they're accurately referred to as mutual exclusion mechanisms) used to protect shared memory structures in the system global area (SGA).
Issues by Row Cache Lock Enqueue Type: For each enqueue type, there are a limited number of operations that require each enqueue. The enqueue type therefore may give an indication as the type of operation that may be causing the issue. As such some common reasons are outlined below along with SQL that helps to find the qnqueue type:
select * from v$rowcache where cache# IN (select P1 from dba_hist_active_sess_history where sample_time between to_date('26-MAR-14 12:49:00','DD-MON-YY HH24:MI:SS') and to_date('26-MAR-14 12:54:00','DD-MON-YY HH24:MI:SS') and event = 'row cache lock' ) /
DC_SEQUENCES Caused by using sequences in simultaneous insert operations. => Consider caching sequences using the cache option. Especially important on RAC instances! Bug 6027068: Contention on ORA_TQ_BASE sequence -fixed in 10.2.0.5 and 11.2.0.1
DC_OBJECTS Look for any object compilation activity which might require an exclusive lock, blocking other activities. If object compiles are occurring this can require an exclusive lock which will block other activity. Tune by examining invalid objects and dependencies with following SQL:
select * from dba_objects order by last_ddl_time desc; select * from dba_objects where status = 'INVALID';
Can be a bug like the following ones: Bug 11070004: High row cache objects latch contention w/ oracle text queries Bug 11693365: Concurrent Drop table and Select on Reference constraint table hangs(deadlock) � fixed in 12.1 DC_SEGMENTS This is most likely due to segment allocation. Identify what the session holding the enqueue is doing and use errorstacks to diagnose.
DC_USERS + This may occur if a session issues a GRANT to a user, and that user is in the process of logging on to the database. + Excessive calls to dc_users can be a symptom of �set role XXXX� + You can check the presents of massive login attempts, even the failed ones by analyzing listener.log (use OEM 12c-> All Metrics or by checking database AUDIT if available or using own tools). : Bug 7715339 - Logon failures causes row cache lock waits - Allow disable of logon delay
DC_TABLESPACES Probably the most likely cause is the allocation of new extents. If extent sizes are set low then the application may constantly be requesting new extents and causing contention. Do you have objects with small extent sizes that are rapidly growing? (You may be able to spot these by looking for objects with large numbers of extents). Check the trace for insert/update activity, check the objects inserted into for number of extents.
DC_USED_EXTENTS and DC_FREE_EXTENTS This row cache lock wait may occur similar during space management operations where tablespaces are fragmented or have inadequate extent sizes. Tune by checking whether tablespaces are fragmented, extent sizes are too small, or tablespaces are managed manually.
DC_ROLLBACK_SEGMENTS + This is due to rollback segment allocation. Just like dc_segments, identify what is holding the enqueue and also generate errorstacks. Possible Bugs: + Bug 7313166 Startup hang with self deadlock on dc_rollback_segments (Versions BELOW 11.2) + Bug 7291739 Contention Under Auto-Tuned Undo Retention (Doc ID 742035.1)
DC_TABLE_SCNS Bug 5756769: Deadlock between Create MVIEW and DML - fixed in 10.2.0.5 ,11.1.07 and 11.2.0.1
DC_AWR_CONTROL This enqueue is related to control of the Automatic Workload Repository. As such any operation manipulating the repository may hold this so look for processes blocking these.
20. OS Thread Startup. This wait event might be seen if the database server is executing on a platform that supports multi-threading. We enter this waiting state while a thread is starting up and leave the wait state when the thread has started or the startup request is cancelled. This indicates some high contention at OS level avoiding even new process startup.
In the event of parallel execution, itf the event "os thread startup" becomes significant, then you may want to increase the value of the database parameter parallel_min_servers. This parameter determines the number of parallel servers that remain ready for new requests. The event "os thread startup" indicates that a SQL statement is waiting for parallel servers to be allocated, which takes time. After an operation is done using a parallel server, it will remain idle for 5 minutes. If no other SQL statement uses it, then the parallel server will be de-allocated after 5 idle minutes. It is recommended to change the value of parallel_min_servers from the default unless the event "os thread startup" has a significant amount of wait time.
rdbms ipc reply: Oracle metric event is used to wait for a reply from one of the background processes.
PX Deq wait events Last update (2013-10-13 17:38:17) Date added (2010-08-05 13:08:09) Summary The main wait events related to parallel processing are: PX Deq: Table Q Normal PX Deq: Execute Reply PX Deq Credit: send blkd PX Deq: Table Q Normal Indicates that the slave wait for data to arrive on its input table queue. In a parallel execution environment we have a producer-consumer model. One slave set works on the data ( e.g. read data from disk , do a join ) called the producer slave set and the other slave set waits to get the data so can start the work. The slaves in this slave set are called consumer. The wait event "PX Deq: Table Q Normal" means that the slaves in the consumer slave have to wait for rows (data) from the other slave set so they can start their work. PX Deq: Execute Reply The QC is expecting a response (acknowledgment) to a control message from the slaves or is expecting to dequeue data from the producer slave set. This means he waits that the slaves finished to execute the SQL statement and that they send the result of the query back to the QC. PX Deq Credit: send blkd The wait events "PX Deq Credit: need buffer" and "PX Deq Credit: send blkd" are occur when data or messages are exchanged between process that are part of a px query.
Buffer Exterminate - Buffer exterminate wait event is caused when using Oracle's "automatic memory management" (AMM) when the MMON process shrinks the data buffer cache to re-allocate that RAM for another SGA region. The experience has indicated that AMM resize operations can hurt overall database performance especially the OLTP environments, and you may want to consider turning off AMM which will relieve the buffer exterminate waits, and manually adjusting your SGA regions.
If you see this in the TOP 5 Times Events, One has to look into v$sga_resize_ops and v$memory_resize_ops and see how many times it is occurring and effecting the performance of your database. If one sees more events of these and especially during your peak times of database one has to turn of the feature adjusting manually the corresponding SGA and PGA sizes.
If you want to analyze Oracle's use of memory and look at various memory resizing operations you can use the V$MEMORY_RESIZE_OPS view. This view contains a list of the last 800 SGA resize requests handled by Oracle. Here is an example:
SELECT parameter, initial_size, target_size, start_time FROM v$memory_resize_ops WHERE initial_size > = 0 and final_size > = 0 ORDER BY parameter, start_time;
This shows that Oracle has made a number of changes to the database cache and the shared pool, over a pretty short period of time. These changes will often decrease as the database stays up for a longer period of time, and you will often see changes as the load profile of the database changes, say from being report heavy to OLTP heavy. Please find below the various tables and thier descriptions to check the information in the database.
TCP Socket (KGAS)
KGAS is a element in the server which handles TCP/IP sockets which is typically used in dedicated connections i.e. by some PLSQL built in packages such as UTL_HTTP and UTL_TCP.
A session is waiting for an external host to provide requested data over a network socket.
The time that this wait event tracks does not indicate a problem, and even a long wait time is not a reason to contact Oracle Support.
It naturally takes time for data to flow between hosts over a network, and for the remote aspect of an application to process any request made to it. An application that communicates with a remote host must wait until the data it will read has arrived.
From an application/network point of view, delays in establishing a network connection may produce unwanted delays for users. We should make sure that the application makes network calls efficiently and that the network is working well such that these delays are minimized.
From the database point of view, these waits can safely be ignored; the wait event does not represent a database issue. It merely reports the total elapsed time for a network connection to be established or for data to arrive from over the network. The database waits for the connection to be established and reports the time taken. Its always good to check with the network or the third-party application vendors to investigate the underlying socket.
But in case of systemwide waits – If the TIME spent waiting for this event is significant then it is best to determine which sessions are showing the wait and drill into what those sessions are doing as the wait is usually related to whatever application pre is executing eg: What part of the application may be using UTL_HTTP or similar and is experiencing waits. This statement can be used to see which sessions may be worth tracing.
SELECT sid, total_waits, time_waited FROM v$session_event WHERE event='TCP Socket (KGAS)' and total_waits>0 ORDER BY 3,2;
In order to reduce these waits or to help find the origin of the socket operations try:
Example:
Execute the following SQL from a session on a dedicated connection and then check the resulting trace file to see “TCP Socket (KGAS)” waits:
alter session set events '10046 trace name context forever, level 8';
SQL*Net message from dblinkis about latency via a database link in a distributed environment.
It generally means that your local system is waiting on the network to transfer the data across the network.
It is a very normal wait event for this sort of query.
This time could go up for these reasons:
But in case of systemwide waits – If the TIME spent waiting for this event is significant then it is best to determine which sessions are showing the wait and drill into what those sessions are doing as the wait is usually related to whatever application pre is executing eg: What part of the application may be using a databade link and is experiencing waits. This statement can be used to see which sessions may be worth tracing.